



Vol. XVII, No. 3. January 29, 1920

The Journal of Philosophy 
Psychology and Scientific Methods 



THE NEED FOR AN EXAMINATION OF CERTAIN
HYPOTHESES IN MENTAL TESTS. 

RELATIVE to the time and number of people devoted to work 
with mental tests, the results have been astonishingly meager 
in theoretical value. Not only do investigators in the field of mental 
tests fail to find generalizations of interpretative value in their own 
material, but writers who are eagerly searching for data and rela- 
tions worth speculating about are given scant reward for any perusal 
of the voluminous literature of mental tests. In view of the unpro- 
ductiveness of the field in propositions of fundamental significance, 
it seems worth while to examine the situation to discover possible 
causes that may explain the failure. We must look for such causes
in conditions basic to the field since it is not likely that any super- 
ficial errors would bring about what from a theoretical point of view 
is great waste of scientific talent. 

The fact that mental tests have some practical value does not 
account for the lack of contribution to theory; in fact, one might
suppose that the increasingly general use of tests in concrete situa- 
tions where the result really makes a difference would make the 
development of sound theory immediate and necessary. Yet one 
finds but little evidence that such stimulus of the theoretical by the 
practical is taking place. It seems to me quite contrary to our ex- 
perience of the nature of the interaction of pure and applied science 
to think that the practical usefulness of tests is limiting their 
possibility for theoretical contribution. 

I venture this explanation. Extensive collection of data through 
mental tests began without the necessary antecedent and contemporan- 
eous development of point of view, hammering out of contradictions 
in concepts and hypotheses, and elimination of ambiguities in com- 
mon everyday words and ideas. There has meanwhile grown up a 
habit of thinking about intelligence and ability which is founded, 
not upon manifestations of intelligence as we commonly experience 
them, but upon derivative facts which are the results of measure-
ment by mental tests. These derivative facts are subject to funda-
mental bias due to the nature of the terms in which the results of
mental test performances have been expressed and due to the type
of analysis which our limited and frequently misused statistical tech- 
nique makes possible. A further complication arises through a 
willingness to accept statistical hypotheses as applied to intelligence 
simply to have statistical technique available for use. Now these 
habits of thinking which have been grounded on misleading deriva- 
tive facts are the intellectual equipment which has been available in 
the analysis of further derivative facts. Naturally, it has been im- 
possible to arrive at propositions of theoretical importance, the tool 
of criticism being of the same substance and no more finely tempered 
than the material to which it has been applied. The piling up of 
data has therefore been of little advantage; in fact, it has created a
wilderness of tangled issues of trifling importance, removing still
further the possibility of theoretical evaluation and interpretation. 

In order to justify this position and to clear up if possible the 
obscurity of the explanation, let me give some illustrations of the 
way derivative facts, the test measurements, may mislead. I am 
afraid I shall have to take some definition of intelligence for the 
purposes of this paper, but if the reader does not like my definition, 
he may substitute any he happens to have a fondness for — the 
propositions I wish to make will hold good, I believe, for any ordi- 
nary conception. 

Let us take Stern's definition which is generally known and 
widely accepted in its main implications. "Intelligence is a gen- 
eral capacity of an individual consciously to adjust his thinking to 
new requirements: it is general mental adaptability to new problems
and conditions of life." In spite of the assumptions that are made
in putting the term "general" into the definition, this concept will
be useful enough here. 

If we can, let us abandon the terms and concepts which mental 
tests have given us and approach intelligence, this general mental 
capacity, as an adaptive function with which we are continuously 
in contact in our ordinary experience. One of our thought habits 
that we should be likely to question first is that general intelli- 
gence, even in quantitative terms, can be expressed as a linear or 
one-dimensional function. That is, we should question whether 
of two individuals, Henry and Henrietta, one must of necessity be 
equal to, greater than, or less than the other in general mental 
adaptability. It is interesting to see how this thought habit that 
quantitative intelligence must be a linear intelligence may have 
arisen. In measuring the performance of an individual in any 
test, the scale which we use, be it "seconds" or "correct re-
sponses," is a linear scale; where it is not a linear scale, as when
time and accuracy are observed, an index is shortly forthcoming
which is linear. These measures, in terms of linear scales, become 
symbolic of the individual's performance, and are in this sense 
derivative facts introducing the bias of the linear scale into the 
comparisons of the abilities of various individuals. When the 
results of several tests are combined, as for example, in the Binet 
series or the Army Intelligence tests, the standing in the combina- 
tion is again expressed in terms of a linear scale, not because we 
have analyzed our concept of and experiences with general intelli- 
gence and have found it so expressible, but because our common 
methods of test measurement and combination preclude any other 
result. 

I am inclined to think that intelligence may best be thought of 
quantitatively as multi-dimensional, a somewhat different thing from 
multi-focal; and that general intelligence may be expressed as posi-
tion in multi-dimensional space. I do not wish to enlarge on this
point of view at this time, except to indicate how even though in- 
telligence be multi-dimensional, a linear statement might serve with 
considerable success for practical purposes as it has done to a very 
real extent. 

In talking about the size of individuals we are able to distinguish 
well enough between large and small men, recognizing that we 
consider height and weight in making our judgments. If a man 
be tall and heavy, he is large in size; if he be short and light, he is
small in size. If we should combine quantitative measures of height 
and weight for these two individuals just as we combine the meas- 
urements on different tests, we should have size expressed on a 
linear scale in terms that check up well enough with the facts. If, 
however, a man be tall and light, or if he be short and heavy, and 
if we should combine these measures, we should find these two men 
to be "average" in size, a thing which, if anything, they are not. 
Size thus breaks down as a variable that can be measured in linear 
terms, because quantitatively size is at least two-dimensional, and
"general size" must be stated as position in two-dimensional space. 
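
The argument of this paragraph can be put in figures. What follows is only a minimal sketch: the four men, their standardized heights and weights, and the function name linear_size are hypothetical, and the averaging rule is assumed merely as one common way of combining measures. It shows the two mixed cases collapsing to the same "average" value, although their positions in two-dimensional space are far apart.

```python
# Hypothetical standardized scores (height, weight); the figures are illustrative only.
men = {
    "tall and heavy": (1.5, 1.5),
    "short and light": (-1.5, -1.5),
    "tall and light": (1.5, -1.5),
    "short and heavy": (-1.5, 1.5),
}


def linear_size(height_z, weight_z):
    """Collapse two dimensions into one by simple averaging (an assumed rule)."""
    return (height_z + weight_z) / 2


# One number per man, as a combined test score would give.
composite = {name: linear_size(h, w) for name, (h, w) in men.items()}

for name, (h, w) in men.items():
    print(f"{name:16s} position=({h:+.1f}, {w:+.1f})  linear size={composite[name]:+.2f}")
```

The linear figure separates the first two men well enough, which is why it serves in practice, but it cannot distinguish the last two from each other or from a man of middling height and weight.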

The reason we can talk about men being large and men being 
small, is because of a correlation that exists between height and 
weight. But we do not deceive ourselves by thinking that size is 
an objective attribute measurable in linear terms; we never refer 
to a tall thin man as a man of average size. We have, however, 
grown into the habit of thinking that general intelligence is ex- 
pressible linearly, and in my opinion this is due, let me repeat, to 
the influence of derivative facts in shaping our concepts. General 
intelligence might better be thought of as position in multi-dimen-
sional space, just as size is considered position in two-dimensional
space. 

Let us consider a bias of another type. If we were approaching 
the field without too definite statistical prejudices, I am inclined to 
think that we should question before we got very far the implica- 
tions of the assumption of linear regressions between test perform- 
ance and general intelligence. The assumption is made quite gen- 
erally and it has affected basically our thinking about the measure- 
ment of ability. 

It should be stated at this point that evidence of linearity con- 
sisting of the regression of test measurements on judged intelli- 
gence is ordinarily worthless. The range of ability tested is usually 
so narrow, or the method of obtaining judgments so prejudices the 
facts as quantitative measurements that the regressions observed are 
descriptions of actual conditions at second or third hand at best. 

Consider any test you please; it is fairly obvious that for certain
ranges, either extremely high or extremely low, differences in in- 
telligence will not be paralleled by differences in test performance. 
Where we have a fairly objective criterion as age in maturity rela- 
tions or trade skill in trade test relations, we find gross departures 
from linearity the rule rather than the exception. 

Yet the habit of thinking of these relations in terms of correlation 
coefficients with the implied assumption of linearity is quite general. 
It is of course basic to all attempts to combine tests by the partial 
correlation method, a method that was found quite inapplicable in 
the preparation of trade tests where the regressions could be studied. 
Consequently, we are building on the sand as long as the con- 
sequences of such an assumption are not critically examined. 

One other illustration should serve, though I have no idea that 
the possibility of such illustrations is exhausted. We should prob- 
ably not admit that we, as individuals, are of the same general 
intelligence from time to time if we were very hard pressed on the 
point. We know pretty definitely that our "general mental adapta- 
bility to new problems" varies markedly from time to time and 
place to place. It varies with what we have eaten and how we have 
slept, with time of day and character of our immediate associates. 
For some people this variability is probably greater than for others. 

But an assumption of a static intelligence level is necessary 
to mental test work as it is now conceived. It may work well enough 
for practical purposes, but it is no basis for speculation. Such an 
assumption seems based on a certain degree of uniformity as found 
in testing the same individuals at different times. So much the 
worse for the tests! If we did not need such an assumption so
badly, we should question at once whether tests giving the same
rating from time to time are not extremely insensitive measures of 
general mental adaptability. This bias is strengthened by the 
necessity for making such an assumption in order to use the methods 
of attenuation which have been popular. For although one can 
make allowances in studying the size of correlation coefficients for 
errors that are made in measuring a static thing, to make such 
corrections when the thing measured is unstable and variable is 
hardly permissible. 
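
For readers who wish to see where the static assumption enters, here is a minimal sketch. It assumes that the "methods of attenuation" referred to are Spearman's familiar correction, in which the observed coefficient is divided by the square root of the product of the two reliability coefficients; the function name and all the figures are hypothetical.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation.

    Assumption built into the formula: each reliability coefficient reflects
    errors of measurement only, i.e. the thing measured did not itself change
    between the two sittings.
    """
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical figures, chosen only for illustration.
r_observed = 0.40        # correlation between a test and a criterion
retest_test = 0.64       # test-retest correlation of the test
retest_criterion = 0.64  # retest correlation of the criterion

corrected = correct_for_attenuation(r_observed, retest_test, retest_criterion)
print(f"observed r = {r_observed:.2f}, corrected r = {corrected:.3f}")
```

If part of the disagreement between two sittings is a real change in the individual's adaptability, the formula nevertheless treats it as error, and the corrected coefficient is raised on the strength of a variation that was never a fault of measurement.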

In my opinion, the fruitlessness of the mental test field is caused 
by the persistence of such thought habits as the three I have de- 
scribed. I would not contend that the propositions I have made 
are true. I only want to show that many fundamental notions that
color the whole test field are open to superficial criticism to say the 
least. The justification found in Stern, that just as electricity is 
measured without too precise a knowledge of electricity, intelli- 
gence can also be measured without a final theoretical groundwork, 
has carried us too far. We must examine our basic hypotheses, 
putting aside as far as possible such concepts as we have formed as 
a result of the study of derivative facts. We must abandon in 
research that we hope will be of theoretical importance, assumptions 
and methods which critical analysis shows to be faulty, painful as 
this procedure may be. It is not my intention to question the very 
real practical value of mental tests. But the usefulness of mental 
tests in concrete situations can not increase beyond a certain point 
unless, along with the activity in the field as an applied science, 
results of a speculative and interpretative value are secured. It is 
probable that many of the failures of mental tests can be traced to 
our present inadequate theoretical foundations. 

Beardsley Ruml.
The Scott Company Laboratory.



PROFESSOR STRONG'S THEORY OF "ESSENCE" 

I AM in agreement with so many things in the epistemological
part of Professor Strong's recent volume, that I hesitate to
put myself in the position of a critic. I should prefer to have it 
understood that I am raising certain questions of interpretation 
rather, with the design, not so much of establishing a rival point of 
view, as of clearing up ambiguities in the interests of a common 
platform. I do not feel clear to what extent, if any, Professor 
Strong really would disagree with the claims I shall here advance. 
But I do feel that there are points on which his own pronounce- 



